Goto

Collaborating Authors

 robotic platform


Control Your Robot: A Unified System for Robot Control and Policy Deployment

Nian, Tian, Ke, Weijie, Zhu, Shaolong, Hu, Bingshan

arXiv.org Artificial Intelligence

Cross-platform robot control remains difficult because hardware interfaces, data formats, and control paradigms vary widely, which fragments toolchains and slows deployment. To address this, we present Control Your Robot, a modular, general-purpose framework that unifies data collection and policy deployment across diverse platforms. The system reduces fragmentation through a standardized workflow with modular design, unified APIs, and a closed-loop architecture. It supports flexible robot registration, dual-mode control with teleoperation and trajectory playback, and seamless integration from multimodal data acquisition to inference. Experiments on single-arm and dual-arm systems show efficient, low-latency data collection and effective support for policy learning with imitation learning and vision-language-action models. Policies trained on data gathered by Control Your Robot match expert demonstrations closely, indicating that the framework enables scalable and reproducible robot learning across platforms.


DINO-CVA: A Multimodal Goal-Conditioned Vision-to-Action Model for Autonomous Catheter Navigation

Fekri, Pedram, Roshanfar, Majid, Barbeau, Samuel, Famouri, Seyedfarzad, Looi, Thomas, Podolsky, Dale, Zadeh, Mehrdad, Dargahi, Javad

arXiv.org Artificial Intelligence

Cardiac catheterization remains a cornerstone of minimally invasive interventions, yet it continues to rely heavily on manual operation. Despite advances in robotic platforms, existing systems are predominantly follow-leader in nature, requiring continuous physician input and lacking intelligent autonomy. This dependency contributes to operator fatigue, more radiation exposure, and variability in procedural outcomes. This work moves towards autonomous catheter navigation by introducing DINO-CVA, a multimodal goal-conditioned behavior cloning framework. The proposed model fuses visual observations and joystick kinematics into a joint embedding space, enabling policies that are both vision-aware and kinematic-aware. Actions are predicted autoregressively from expert demonstrations, with goal conditioning guiding navigation toward specified destinations. A robotic experimental setup with a synthetic vascular phantom was designed to collect multimodal datasets and evaluate performance. Results show that DINO-CVA achieves high accuracy in predicting actions, matching the performance of a kinematics-only baseline while additionally grounding predictions in the anatomical environment. These findings establish the feasibility of multimodal, goal-conditioned architectures for catheter navigation, representing an important step toward reducing operator dependency and improving the reliability of catheterbased therapies.


A novel autonomous microplastics surveying robot for beach environments

Iqbal, Hassan, Rex, Kobiny, Shirley, Joseph, Baiz, Carlos, Claudel, Christian

arXiv.org Artificial Intelligence

Microplastics, defined as plastic particles smaller than 5 millimeters, have become a pervasive environmental contaminant that accumulates on beaches due to wind patterns and tidal forcing. Detecting microplastics and mapping their concentration in the wild remains one of the primary challenges in addressing this environmental issue. This paper introduces a novel robotic platform that automatically detects and chemically analyzes microplastics on beach surfaces. This mobile manipulator system scans areas for microplastics using a camera mounted on the robotic arm's end effector. The system effectively segments candidate microplastic particles on sand surfaces even in the presence of organic matter such as leaves and clams. Once a candidate microplastic particle is detected, the system steers a near-infrared (NIR) spectroscopic sensor onto the particle using both NIR and visual feedback to chemically analyze it in real-time. Through experiments in lab and beach environments, the system is shown to achieve an excellent positional precision in manipulation control and high microplastic classification accuracy.


Robotic Multimodal Data Acquisition for In-Field Deep Learning Estimation of Cover Crop Biomass

Johnson, Joe, Chalasani, Phanender, Shah, Arnav, Ray, Ram L., Bagavathiannan, Muthukumar

arXiv.org Artificial Intelligence

Accurate weed management is essential for mitigating significant crop yield losses, necessitating effective weed suppression strategies in agricultural systems. Integrating cover crops (CC) offers multiple benefits, including soil erosion reduction, weed suppression, decreased nitrogen requirements, and enhanced carbon sequestration, all of which are closely tied to the aboveground biomass (AGB) they produce. However, biomass production varies significantly due to microsite variability, making accurate estimation and mapping essential for identifying zones of poor weed suppression and optimizing targeted management strategies. To address this challenge, developing a comprehensive CC map, including its AGB distribution, will enable informed decision-making regarding weed control methods and optimal application rates. Manual visual inspection is impractical and labor-intensive, especially given the extensive field size and the wide diversity and variation of weed species and sizes. In this context, optical imagery and Light Detection and Ranging (LiDAR) data are two prominent sources with unique characteristics that enhance AGB estimation. This study introduces a ground robot-mounted multimodal sensor system designed for agricultural field mapping. The system integrates optical and LiDAR data, leveraging machine learning (ML) methods for data fusion to improve biomass predictions. The best ML-based model for dry AGB estimation achieved a coefficient of determination value of 0.88, demonstrating robust performance in diverse field conditions. This approach offers valuable insights for site-specific management, enabling precise weed suppression strategies and promoting sustainable farming practices.


ARChemist: Autonomous Robotic Chemistry System Architecture

Fakhruldeen, Hatem, Pizzuto, Gabriella, Glowacki, Jakub, Cooper, Andrew Ian

arXiv.org Artificial Intelligence

-- Automated laboratory experiments have the potential to propel new discoveries, while increasing reproducibility and improving scientists' safety when handling dangerous materials. However, many automated laboratory workflows have not fully leveraged the remarkable advancements in robotics and digital lab equipment. As a result, most robotic systems used in the labs are programmed specifically for a single experiment, often relying on proprietary architectures or using unconventional hardware. In this work, we tackle this problem by proposing a novel robotic system architecture specifically designed with and for chemists, which allows the scientist to easily reconfigure their setup for new experiments. Specifically, the system's strength is its ability to combine together heterogeneous robotic platforms with standard laboratory equipment to create different experimental setups. Finally, we show how the architecture can be used for specific laboratory experiments through case studies such as solubility screening and crystallisation. I. INTRODUCTION Accelerating the discovery of new materials is important for industrial applications such as healthcare and energy production. This can be achieved through running long-term experiments autonomously, for example by increasing the use of robotic platforms in laboratories. In practice, this would accumulate more experiments in less time, and potentially minimise the scientists' exposure to harmful chemicals, reducing their repetitive tasks.


UTTG_ A Universal Teleoperation Approach via Online Trajectory Generation

Fang, Shengjian, Zhou, Yixuan, Zheng, Yu, Jiang, Pengyu, Liu, Siyuan, Wang, Hesheng

arXiv.org Artificial Intelligence

Teleoperation is crucial for hazardous environment operations and serves as a key tool for collecting expert demonstrations in robot learning. However, existing methods face robotic hardware dependency and control frequency mismatches between teleoperation devices and robotic platforms. Our approach automatically extracts kinematic parameters from unified robot description format (URDF) files, and enables pluggable deployment across diverse robots through uniform interfaces. The proposed interpolation algorithm bridges the frequency gap between low-rate human inputs and high-frequency robotic control commands through online continuous trajectory generation, \n{while requiring no access to the closed, bottom-level control loop}. To enhance trajectory smoothness, we introduce a minimum-stretch spline that optimizes the motion quality. The system further provides precision and rapid modes to accommodate different task requirements. Experiments across various robotic platforms including dual-arm ones demonstrate generality and smooth operation performance of our methods. The code is developed in C++ with python interface, and available at https://github.com/IRMV-Manipulation-Group/UTTG.


Dexterous Control of an 11-DOF Redundant Robot for CT-Guided Needle Insertion With Task-Oriented Weighted Policies

Zhang, Peihan, Richter, Florian, Duriseti, Ishan, Yip, Michael

arXiv.org Artificial Intelligence

Computed tomography (CT)-guided needle biopsies are critical for diagnosing a range of conditions, including lung cancer, but present challenges such as limited in-bore space, prolonged procedure times, and radiation exposure. Robotic assistance offers a promising solution by improving needle trajectory accuracy, reducing radiation exposure, and enabling real-time adjustments. In our previous work, we introduced a redundant robotic platform designed for dexterous needle insertion within the confined CT bore. However, its limited base mobility restricts flexible deployment in clinical settings. In this study, we present an improved 11-degree-of-freedom (DOF) robotic system that integrates a 6-DOF robotic base with a 5-DOF cable-driven end-effector, significantly enhancing workspace flexibility and precision. With the hyper-redundant degrees of freedom, we introduce a weighted inverse kinematics controller with a two-stage priority scheme for large-scale movement and fine in-bore adjustments, along with a null-space control strategy to optimize dexterity. We validate our system through both simulation and real-world experiments, demonstrating superior tracking accuracy and enhanced manipulability in CT-guided procedures. The study provides a strong case for hyper-redundancy and null-space control formulations for robot-assisted needle biopsy scenarios.


FLEX: A Framework for Learning Robot-Agnostic Force-based Skills Involving Sustained Contact Object Manipulation

Fang, Shijie, Gao, Wenchang, Goel, Shivam, Thierauf, Christopher, Scheutz, Matthias, Sinapov, Jivko

arXiv.org Artificial Intelligence

Learning to manipulate objects efficiently, particularly those involving sustained contact (e.g., pushing, sliding) and articulated parts (e.g., drawers, doors), presents significant challenges. Traditional methods, such as robot-centric reinforcement learning (RL), imitation learning, and hybrid techniques, require massive training and often struggle to generalize across different objects and robot platforms. We propose a novel framework for learning object-centric manipulation policies in force space, decoupling the robot from the object. By directly applying forces to selected regions of the object, our method simplifies the action space, reduces unnecessary exploration, and decreases simulation overhead. This approach, trained in simulation on a small set of representative objects, captures object dynamics -- such as joint configurations -- allowing policies to generalize effectively to new, unseen objects. Decoupling these policies from robot-specific dynamics enables direct transfer to different robotic platforms (e.g., Kinova, Panda, UR5) without retraining. Our evaluations demonstrate that the method significantly outperforms baselines, achieving over an order of magnitude improvement in training efficiency compared to other state-of-the-art methods. Additionally, operating in force space enhances policy transferability across diverse robot platforms and object types. We further showcase the applicability of our method in a real-world robotic setting. For supplementary materials and videos, please visit: https://tufts-ai-robotics-group.github.io/FLEX/


Real-Time Neuromorphic Navigation: Guiding Physical Robots with Event-Based Sensing and Task-Specific Reconfigurable Autonomy Stack

Sanyal, Sourav, Joshi, Amogh, Kosta, Adarsh, Roy, Kaushik

arXiv.org Artificial Intelligence

Neuromorphic vision, inspired by biological neural systems, has recently gained significant attention for its potential in enhancing robotic autonomy. This paper presents a systematic exploration of a proposed Neuromorphic Navigation framework that uses event-based neuromorphic vision to enable efficient, real-time navigation in robotic systems. We discuss the core concepts of neuromorphic vision and navigation, highlighting their impact on improving robotic perception and decision-making. The proposed reconfigurable Neuromorphic Navigation framework adapts to the specific needs of both ground robots (Turtlebot) and aerial robots (Bebop2 quadrotor), addressing the task-specific design requirements (algorithms) for optimal performance across the autonomous navigation stack -- Perception, Planning, and Control. We demonstrate the versatility and the effectiveness of the framework through two case studies: a Turtlebot performing local replanning for real-time navigation and a Bebop2 quadrotor navigating through moving gates. Our work provides a scalable approach to task-specific, real-time robot autonomy leveraging neuromorphic systems, paving the way for energy-efficient autonomous navigation.


Active Illumination for Visual Ego-Motion Estimation in the Dark

Crocetti, Francesco, Dionigi, Alberto, Brilli, Raffaele, Costante, Gabriele, Valigi, Paolo

arXiv.org Artificial Intelligence

In this paper, we propose a novel active illumination framework to enhance the performance of VO and V-SLAM algorithms in these challenging conditions. The developed approach dynamically controls a moving light source to illuminate highly textured areas, thereby improving feature extraction and tracking. Specifically, a detector block, which incorporates a deep learning-based enhancing network, identifies regions with relevant features. Then, a pan-tilt controller is responsible for guiding the light beam toward these areas, so that to provide information-rich images to the ego-motion estimation algorithm. Experimental results on a real robotic platform demonstrate the effectiveness of the proposed method, showing a reduction in the pose estimation error up to 75% with respect to a traditional fixed lighting technique. I. INTRODUCTION Vision-based pose estimation is one of the most widespread strategies to achieve mobile robot localization. Several effective Visual Odometry (VO) and Visual SLAM (V -SLAM) approaches have flourished in the last decades [1], and the recent emergence of visual-inertial techniques has shown even more impressive results [2], [3]. The effectiveness of VO and V -SLAM solutions depends on the capability to extract robust and highly-descriptive visual features.